Von Mises-Fisher Clustering Models
Abstract
This paper proposes a suite of models for clustering high-dimensional data on a unit sphere, based on the von Mises-Fisher (vMF) distribution, with the aim of discovering more intuitive clusters than existing approaches. The proposed models include a) a Bayesian formulation of the vMF mixture that enables information sharing among clusters, b) a Hierarchical vMF mixture that provides multiscale shrinkage and a tree-structured view of the data, and c) a Temporal vMF mixture that captures the evolution of clusters in temporal data. For posterior inference, we develop fast variational methods as well as collapsed Gibbs sampling techniques for all three models. Our experiments on six datasets provide strong empirical support in favour of vMF-based clustering models over other popular tools such as K-means, Multinomial Mixtures and Latent Dirichlet Allocation.
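To make the underlying building block concrete, here is a minimal sketch of a plain (non-Bayesian) vMF mixture fitted by maximum-likelihood EM. It is not the Bayesian, Hierarchical, or Temporal model from the abstract, and it does not use variational inference or Gibbs sampling. It is restricted to 3-D unit vectors, where the vMF normalizer has the closed form kappa / (4 pi sinh(kappa)), and it updates kappa with the approximation of Banerjee et al. (2005). The names `log_vmf_3d`, `fit_vmf_mixture_3d`, and `noisy_points` are hypothetical, and the demo data are normalized Gaussian perturbations rather than exact vMF samples.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]

def log_vmf_3d(x, mu, kappa):
    """Log density of a 3-D vMF distribution at unit vector x (kappa > 0)."""
    # log sinh(kappa) computed stably as kappa + log(1 - exp(-2 kappa)) - log 2.
    log_sinh = kappa + math.log1p(-math.exp(-2.0 * kappa)) - math.log(2.0)
    dot = sum(a * b for a, b in zip(x, mu))
    return math.log(kappa) - math.log(4.0 * math.pi) - log_sinh + kappa * dot

def fit_vmf_mixture_3d(data, init_means, n_iter=50):
    """EM for a mixture of 3-D vMF distributions (illustrative sketch)."""
    k = len(init_means)
    means = [normalize(m) for m in init_means]
    kappas = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior responsibilities via log-sum-exp.
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_vmf_3d(x, means[j], kappas[j])
                    for j in range(k)]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: update weight, mean direction, and concentration per cluster.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            rsum = [sum(r[j] * x[d] for r, x in zip(resp, data)) for d in range(3)]
            norm = math.sqrt(sum(a * a for a in rsum))
            if nj < 1e-12 or norm < 1e-12:
                continue  # empty cluster: keep previous parameters
            weights[j] = max(nj / len(data), 1e-12)
            means[j] = [a / norm for a in rsum]
            rbar = min(norm / nj, 1.0 - 1e-9)
            # Banerjee et al. approximation with dimension d = 3.
            kappas[j] = max((3.0 * rbar - rbar ** 3) / (1.0 - rbar ** 2), 1e-3)
    return weights, means, kappas

def noisy_points(center, n, noise, rng):
    """Assumed toy generator: Gaussian noise around a direction, renormalized."""
    return [normalize([c + rng.gauss(0.0, noise) for c in center]) for _ in range(n)]

rng = random.Random(0)
data = (noisy_points([0.0, 0.0, 1.0], 100, 0.1, rng)
        + noisy_points([1.0, 0.0, 0.0], 100, 0.1, rng))
# Initialize one mean from each end of the data so the two clusters are both seeded.
weights, means, kappas = fit_vmf_mixture_3d(data, init_means=[data[0], data[-1]])
```

Seeding the means from one point in each group is a convenience for this toy data; on real data, several random restarts or a spherical k-means initialization would be a more typical choice.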
Similar resources
Hierarchical 3-D von Mises-Fisher Mixture Model
In this paper, we propose a complete method for clustering data in the form of unit vectors. The solution consists of a distribution-based clustering algorithm that assumes a generative model. In the model, the data are generated from a finite statistical mixture model based on the von Mises-Fisher (vMF) distribution. Initially, the Bregman soft clustering algorithm is applied...
movMF: An R Package for Fitting Mixtures of von Mises-Fisher Distributions
Finite mixtures of von Mises-Fisher distributions allow model-based clustering methods to be applied to data of standardized length, i.e., data whose points all lie on the unit sphere. The R package movMF contains functionality to draw samples from finite mixtures of von Mises-Fisher distributions and to fit these models using the expectation-maximization algorithm for maximum likelihood estimati...
Computational Representation of White Matter Fiber Orientations
We present a new methodology based on directional data clustering to represent white matter fiber orientations in magnetic resonance analyses for high angular resolution diffusion imaging. A probabilistic methodology is proposed for estimating intravoxel principal fiber directions, based on clustering directional data arising from orientation distribution function (ODF) profiles. ODF reconstruc...
Clustering using EM and CEM, cluster number selection via the Von Mises-Fisher mixture models
We consider the problem of clustering directional data, and specifically the choice of the number of clusters. Framing this problem within the mixture approach, we perform a comparative study of different criteria. Monte Carlo simulations are performed, taking into account the degree of cluster overlap and the size of the data.
Publication date: 2014